Next-generation sequencing shows the genomic features of ovarian clear cell cancer and compares the genetic architectures of high-grade serous ovarian cancer and clear cell carcinoma in ovarian and endometrial tissues

Ovarian clear cell carcinoma (OCCC) is a distinct histological type of epithelial ovarian cancer (EOC). Unlike the most common type of ovarian cancer, high-grade serous ovarian carcinoma (HGSOC), it is not derived from epithelial cells of the ovary or fallopian tube; instead, it is closely related to endometriosis and resembles endometrial clear cell carcinoma (ECCC) in its morphologic and phenotypic features. However, few data are available on the genomic features of OCCC or on how they compare with those of HGSOC and ECCC. Herein, we used next-generation sequencing of a panel of 1,021 genes to profile the mutational alterations in 34 OCCC cases and compared them with those from HGSOC (402 cases) and ECCC (30 cases). ARID1A and PIK3CA were the most frequently mutated genes in OCCC. Clonal architecture analysis showed that all gene mutations occurred at a later stage of OCCC progression, whereas the KRAS mutation, which usually occurred in samples carrying ARID1A or PIK3CA mutations, was an earlier event than the ARID1A or PIK3CA mutation. The mutation frequencies of the main driver genes were similar between OCCC and ECCC, while TP53 was the main mutation in HGSOC and ECCC. OCCC and ECCC shared mutational signatures with a commonly observed C>T change, indicating a common carcinogen exposure in these two carcinomas, whereas HGSOC and ECCC had both common and distinct mutational signatures. In addition, we identified novel CNV gains in NF1, ASXL1, TCF7L2, CREBBP and LRP1B and losses in ATM, FANCM, RB1 and FLT in OCCC. Our study offers a new perspective on OCCC tumorigenesis by comparing the genomic architectures of tumors from two organs, the ovary and the uterus, and reveals novel CNV events that may provide theoretical support for OCCC treatment.


INTRODUCTION
Ovarian clear cell carcinoma (OCCC) is a specific histological type of ovarian malignant tumor (WHO, 2020) that accounts for 5% to 25% of epithelial ovarian cancer (EOC). Its incidence is rising year by year and its patients are becoming younger, especially among Asians (Anglesio et al., 2011; Machida et al., 2019; Pozzati et al., 2018). Unlike high-grade serous ovarian cancer (HGSOC), OCCC does not originate from epithelial cells of the ovary or fallopian tube but is closely related to endometriosis (Gadducci, Lanfredini & Tana, 2014; Jenison et al., 1989; Munksgaard & Blaakaer, 2012). Moreover, its morphologic and phenotypic features are more similar to those of endometrial clear cell carcinoma (ECCC) (Ju et al., 2018; Travaglino et al., 2020). Although previous work has tried to reveal the tumorigenic mechanism of OCCC at the genomic and molecular level, much of it remains obscure.
High-frequency somatic mutations in OCCC affect AT-rich interaction domain 1A (ARID1A) and phosphatidylinositol-4,5-bisphosphate 3-kinase (PIK3) catalytic subunit alpha (PIK3CA), a pattern also seen in endometriosis without cancer, while the common mutation in HGSOC and ECCC is in tumor protein 53 (TP53) (Anglesio et al., 2017; Baniak et al., 2019; Wiegand et al., 2010). To date, there are no approved targeted therapies specific to OCCC. Patients with OCCC currently receive the same chemotherapy regimen as patients with HGSOC: cisplatin-based combination chemotherapy. However, the response rate to cisplatin in OCCC is only 11%-50%, and the majority of patients with OCCC are resistant to cisplatin (Crotzer et al., 2007; Takano, Tsuda & Sugiyama, 2012). The survival rate of patients with advanced-stage OCCC is much lower than that of patients with other EOC subtypes (Sugiyama et al., 2000). Therefore, to improve the prognosis of patients with OCCC, it is necessary to strengthen research on the pathogenesis of OCCC and to develop a more effective therapeutic strategy.
Herein, we profiled the mutational features of 68 tissue samples (tumor and matched normal tissue) from 34 patients with OCCC using a 1,021-gene next-generation sequencing panel. We further integrated sequencing data (MSK panel) from 30 patients with ECCC and whole-exome sequencing (WES) data from 402 patients with HGSOC obtained from the public databases of The Cancer Genome Atlas (TCGA). We describe the common and unique genetic alterations in OCCC, HGSOC, and ECCC, including clonal architecture, mutational signatures and copy number variations (CNV), to illuminate the mechanisms underlying OCCC tumorigenesis and progression.

Sample collection
The ECCC sequencing data (MSK panel) were downloaded from the literature (DeLair et al., 2017) and the HGSOC sequencing data were downloaded from the cBioPortal database (http://www.cbioportal.org/). The OCCC data (1,021-gene panel) were collected from the Geneplus genomic data bank from May 2015 to August 2022. Detailed clinicopathological and sample information is shown in Table S1. This retrospective study was designed and conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained for sample collection, gene sequencing, and data analysis, and the information obtained was authorized for publication. The study was approved by the Institutional Review Board (IRB) of Taizhou Hospital of Zhejiang Province (No. K20220844).

Targeted capture sequencing
Genomic DNA (gDNA) was extracted from OCCC samples and matched normal tissues using a tissue kit (Qiagen, Hilden, Germany). Sequencing libraries were prepared with the KAPA DNA Library Preparation Kit (Kapa Biosystems, MA, USA) following the manufacturer's instructions. For both samples, barcoded libraries were hybridized to a panel of 1,021 genes covering the full exons and selected introns of 288 genes and selected regions of 733 genes. The comprehensive gene list was described in our earlier research (Wang et al., 2020). The DNA was sequenced on the DNBSEQ-T7RS (BGI, Shenzhen, China) with 100 bp paired-end reads.

Mutation, somatic interactions, and somatic copy number variation calling
MuTect (version 1.1.4) was used to identify single nucleotide variations (SNVs) and small insertions and deletions (Indels). For quality control, somatic mutations were retained only if they met the following criteria: (i) they were present in less than 1% of the population in the 1000 Genomes Project, the Exome Aggregation Consortium, and the Genome Aggregation Database (gnomAD) (https://gnomad.broadinstitute.org); (ii) they were not present in paired germline DNA from normal tissues; and (iii) they were supported by at least five high-quality reads and showed no paired-end read bias. The maftools package was used to delineate the mutational landscape (Mayakonda et al., 2018).
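The three retention criteria can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the example values are hypothetical, and `population_af` is assumed to hold the highest allele frequency seen across the population databases.

```python
from dataclasses import dataclass

MAX_POPULATION_AF = 0.01   # criterion (i): present in less than 1% of the population
MIN_HIGH_QUALITY_READS = 5  # criterion (iii): at least five high-quality reads


@dataclass
class Variant:
    """Hypothetical record for one candidate mutation (field names are assumptions)."""
    gene: str
    population_af: float         # highest allele frequency across population databases
    in_paired_germline: bool     # seen in the matched normal-tissue DNA
    high_quality_alt_reads: int  # high-quality reads supporting the variant
    paired_end_read_bias: bool   # flagged for paired-end read bias


def passes_somatic_filter(v: Variant) -> bool:
    """Return True only if the variant meets criteria (i), (ii) and (iii)."""
    return (
        v.population_af < MAX_POPULATION_AF
        and not v.in_paired_germline
        and v.high_quality_alt_reads >= MIN_HIGH_QUALITY_READS
        and not v.paired_end_read_bias
    )


# Invented example records, one failing each criterion and one passing all of them.
candidates = [
    Variant("GENE_A", 0.0, False, 12, False),   # passes
    Variant("GENE_B", 0.05, False, 20, False),  # common polymorphism, fails (i)
    Variant("GENE_C", 0.0, True, 15, False),    # germline, fails (ii)
    Variant("GENE_D", 0.0, False, 3, False),    # too few reads, fails (iii)
    Variant("GENE_E", 0.0, False, 9, True),     # read bias, fails (iii)
]
kept = [v.gene for v in candidates if passes_somatic_filter(v)]
print(kept)
```

Keeping the thresholds as named constants makes it easy to see which criterion each number belongs to.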
The somatic interaction landscape was drawn with maftools (Mayakonda et al., 2018) to identify gene sets mutated in a mutually exclusive or co-occurring manner. For each pair of genes, a Fisher's exact test on a 2×2 contingency table of the frequencies of mutated and non-mutated samples was used to assess the pattern of exclusivity or co-occurrence.
Focal somatic copy number variation (SCNV) was identified by CONTRA (Li et al., 2012) and the frequency of larger fragmental CNV was calculated. To compare the mutational frequency across tumor subtypes, the overlapping region was obtained from each panel. The top 10 CNV genes were extracted from the HGSOC segment file and the OCCC 1,021-gene panel results.
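The pairwise test can be illustrated with a small pure-Python sketch. It is not the maftools implementation; the function names and the example counts are assumptions chosen for illustration. The two-sided P value sums the probabilities of all 2×2 tables with the same margins that are no more likely than the observed table.

```python
from math import comb


def fisher_exact_two_sided(both, a_only, b_only, neither):
    """Two-sided Fisher's exact test for the 2x2 table
    [[both, a_only], [b_only, neither]] (rows: gene A mutated / not mutated,
    columns: gene B mutated / not mutated)."""
    row_a_mut = both + a_only    # samples with gene A mutated
    row_a_wt = b_only + neither  # samples with gene A not mutated
    col_b_mut = both + b_only    # samples with gene B mutated
    total = row_a_mut + row_a_wt
    denom = comb(total, col_b_mut)

    def table_prob(x):
        # Hypergeometric probability of x samples carrying both mutations.
        return comb(row_a_mut, x) * comb(row_a_wt, col_b_mut - x) / denom

    lowest = max(0, col_b_mut - row_a_wt)
    highest = min(row_a_mut, col_b_mut)
    observed = table_prob(both)
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(
        p
        for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def tendency(both, a_only, b_only, neither):
    """Direction of the association, judged from the cross-product of the table."""
    if both * neither > a_only * b_only:
        return "co-occurrence"
    if both * neither < a_only * b_only:
        return "mutual exclusivity"
    return "none"


# Invented counts for a 34-sample cohort (not taken from the study).
counts = (10, 8, 6, 10)
print(tendency(*counts), fisher_exact_two_sided(*counts))
```

The direction of the association is read from the cross-product, and the P value indicates whether that direction is statistically supported.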

Analysis of mutational signature and clonal architecture
The mutational signature analysis was performed with unfiltered somatic mutations using the R package YAPSA (Hubschmann et al., 2021) and matched to the COSMIC signature database (https://cancer.sanger.ac.uk/cosmic/signatures, Mutational Signatures V3.3).
The variant allele frequency (VAF) ratio of mutations was used to infer the clonality of mutational events (Gerlinger et al., 2012). Specifically, the VAF ratio was calculated by dividing the VAF of each mutation by the maximum VAF observed in the same sample. A higher VAF ratio suggested that the respective event occurred at an earlier stage of tumor progression. Mutations with VAF ratios >0.6 were classified as clonal mutations, while the rest were considered subclonal mutations.
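The VAF ratio and the 0.6 cutoff described above can be sketched as follows. The function names and the example VAF values are hypothetical; only the ratio definition and the threshold come from the text.

```python
from typing import Dict

CLONAL_THRESHOLD = 0.6  # VAF ratio above which a mutation is called clonal


def vaf_ratios(sample_vafs: Dict[str, float]) -> Dict[str, float]:
    """Divide each mutation's VAF by the maximum VAF observed in the same sample."""
    if not sample_vafs:
        raise ValueError("sample has no mutations")
    max_vaf = max(sample_vafs.values())
    if max_vaf <= 0:
        raise ValueError("maximum VAF must be positive")
    return {mutation: vaf / max_vaf for mutation, vaf in sample_vafs.items()}


def classify_clonality(sample_vafs: Dict[str, float]) -> Dict[str, str]:
    """Label each mutation as clonal (ratio > 0.6) or subclonal (otherwise)."""
    return {
        mutation: "clonal" if ratio > CLONAL_THRESHOLD else "subclonal"
        for mutation, ratio in vaf_ratios(sample_vafs).items()
    }


# Invented VAFs for one sample (not data from the study).
example_sample = {"mutation_1": 0.40, "mutation_2": 0.30, "mutation_3": 0.10}
ratios = vaf_ratios(example_sample)
labels = classify_clonality(example_sample)
print(ratios)
print(labels)
```

Because the ratio is normalized within each sample, the mutation with the highest VAF always has a ratio of 1 and is therefore always called clonal.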

Statistics
Two-sided Mann-Whitney and Fisher's exact tests were performed on GraphPad Prism (version 8.02) or R (version 3.6.1) to generate the P value. For all tests, a P value <0.05 was considered statistically significant.

Clonal architectures in OCCC, HGSOC, and ECCC
To explore the probable timing of the mutation events that arose in OCCC, HGSOC, and ECCC, we assessed the variant allele frequency (VAF) ratio of somatic mutations and Indels in these tumors using a previously described method (Gerlinger et al., 2012). A high or low median VAF ratio represents an early or late event, respectively. As shown in Fig. 3, driver genes such as TP53 and RB1 were classified as clonal mutations with high VAF ratios (>0.6), indicating that they might occur at an earlier stage than subclonal mutations with low VAF ratios (<0.5) and might play crucial roles in HGSOC tumorigenesis. By contrast, the median VAF value of all genes was less than 0.5 in OCCC and ECCC, suggesting that all alterations were later, subclonal events in these two malignancies. Although the mutational frequencies of ARID1A and PIK3CA were higher than those of other genes, the median VAF value of KRAS was the highest, and the TP53 mutation arose later in OCCC. In ECCC, the median VAF values of TP53, ARID1A, KRAS and PIK3CA mutations were low, consistent with later events. These results suggest that the driver genes that trigger cancer differ among OCCC, ECCC, and HGSOC, that different gene mutations arise at different stages in these tumors, and that some mutations are newly acquired during the malignant evolution of OCCC and ECCC.
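The per-gene comparison of median VAF ratios can be sketched as below. This is an illustrative reading of the approach rather than the analysis code: the function name is hypothetical and the numbers are invented, chosen only so that the ordering is easy to follow.

```python
from collections import defaultdict
from statistics import median


def rank_genes_by_median_vaf_ratio(records):
    """Group (gene, vaf_ratio) pairs by gene and sort genes from the highest
    median VAF ratio (putatively earlier) to the lowest (putatively later)."""
    by_gene = defaultdict(list)
    for gene, ratio in records:
        by_gene[gene].append(ratio)
    medians = {gene: median(ratios) for gene, ratios in by_gene.items()}
    return sorted(medians.items(), key=lambda item: item[1], reverse=True)


# Invented VAF ratios pooled across samples (not data from the study).
records = [
    ("KRAS", 0.45), ("KRAS", 0.35),
    ("ARID1A", 0.30), ("ARID1A", 0.20), ("ARID1A", 0.25),
    ("TP53", 0.10),
]
ranked = rank_genes_by_median_vaf_ratio(records)
for gene, med in ranked:
    print(gene, round(med, 2))
```

A gene can sit at the top of this ranking while still having a median below 0.5, which is how a relatively early event can exist among mutations that are all subclonal.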

Mutational signatures in OCCC, HGSOC, and ECCC
Mutational signatures can reflect the potential carcinogen exposures involved in tumorigenesis and progression. We matched the somatic SNVs and Indels in OCCC to the Catalogue of Somatic Mutations in Cancer (COSMIC) single base substitution (SBS) signatures and insertion and deletion (ID) signatures. Somatic SNVs were mainly attributed to two SBSs in the COSMIC database, SBS7b and SBS31, while Indels were mainly attributed to ID5, ID11, ID16, and ID17 (Fig. 4A). We then compared the SNV mutational signature patterns of OCCC, HGSOC, and ECCC to gain insight into the different etiologies of these tumors. As shown in Fig. 4B, OCCC shared mutational signatures with ECCC, including SBS31 and SBS7b. The C>T (cytosine to thymine transition) mutational pattern of SBS7b is frequently found in cancers of the skin and is associated with exposure to ultraviolet light (Alexandrov et al., 2020). SBS31, characterized by C>T mutations, may be due to platinum drug treatment. The etiology of the ECCC-specific SBS23, which has a C>T pattern, is still unknown. SBS39, which features predominantly C>G alterations, was shared by HGSOC and ECCC. It is highly represented in head and neck squamous cell carcinoma, breast cancer, and ovary-adenoma and has no known proposed etiology (Alexandrov et al., 2020). SBS1 and SBS38 were private signatures in HGSOC. The C>T change of SBS1 reflects an endogenous mutational process initiated by spontaneous or enzymatic deamination of 5-methylcytosine to thymine, which generates G:T mismatches in double-stranded DNA and is considered a cell division/mitotic clock. SBS38, a pattern of C>A mutations, has been found only in melanomas, with potential indirect damage from ultraviolet light. These results suggest that the mutational signature patterns in OCCC are more similar to those in ECCC than to those in HGSOC.
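Substitution classes such as C>T and C>G are conventionally reported relative to the pyrimidine base of the mutated pair, so that a G>A change on one strand is counted as C>T. The sketch below illustrates that normalization only; it is not the YAPSA signature-fitting procedure, and the function name and example calls are assumptions.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Return one of the six pyrimidine-referenced classes
    (C>A, C>G, C>T, T>A, T>C, T>G) for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the change on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


# Invented (ref, alt) calls, not data from the study.
example_snvs = [("C", "T"), ("G", "A"), ("G", "C"), ("A", "C"), ("C", "T")]
class_counts = Counter(substitution_class(ref, alt) for ref, alt in example_snvs)
print(class_counts)
```

Signature assignment itself additionally uses the flanking bases of each substitution, which this sketch leaves out.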

DISCUSSION
In our study, we first identified the genomic features of OCCC and then distinguished the similarities and differences in genetic architecture and clonal patterns among OCCC, ECCC, and HGSOC. We showed that OCCC and ECCC are similar in the mutation rates of key driver genes, in mutational signatures, and in clonal architecture, whereas OCCC and HGSOC have different genetic features. Importantly, our data emphasized the potential role of KRAS in OCCC samples with ARID1A or PIK3CA mutations, showed that LRP1B mutations occurred only in samples without ARID1A truncating mutations, distinguished the potential carcinogen exposures of these tumors by their mutational signatures, and unearthed several novel drivers at the CNV level that might contribute to OCCC tumor progression. The frequently mutated genes in OCCC are PIK3CA (Kato et al., 2019; Kuo et al., 2009) and ARID1A (Katagiri et al., 2012; Wiegand et al., 2010), which frequently co-occur (Oliveira et al., 2021). In our results, coexisting PIK3CA and ARID1A mutations accounted for 29% (10/34) of all OCCC samples, while the mutation rates of TP53 and KRAS were 38% and 21%, respectively. LRP1B mutations were found only in OCCC samples without ARID1A truncating mutations. LRP1B is a tumor suppressor; its mutation has been associated with favorable outcomes to immune checkpoint inhibitors across multiple cancer types (Brown et al., 2021), and LRP1B protein was a predictor of response to pegylated liposomal doxorubicin in patients with ovarian cancer (Dionisio de Sousa et al., 2021). In contrast, ARID1B/CREBBP/MET/RBM10 mutations occurred only in samples with ARID1A truncating mutations, which are related to absent or low ARID1A expression. In a meta-analysis, Jung et al. (2021) reported that low ARID1A expression correlated with poor overall survival of OCCC patients (1,011 cases). These results provide an experimental basis for the choice of treatment for OCCC.
As shown in Fig. 1B, TP53 is the main mutated gene in HGSOC and ECCC and is associated with DNA damage (Baniak et al., 2019; Kroeger Jr & Drapkin, 2017; Vang et al., 2016). OCCC and ECCC have similar histological and molecular phenotypes in the expression of TP53, ER, PR, HIF1β and napsin A (Ju et al., 2018; Lim et al., 2015). In our cohorts, the mutation frequency of ARID1A was significantly higher in OCCC than in ECCC, whereas the mutation frequencies of several canonical cancer driver genes, such as PIK3CA, TP53, KRAS, APC, and KMT2C, were similar between the two cohorts, indicating that these genes might play pivotal roles in clear cell carcinoma tumorigenesis. In addition, we found a mutually exclusive relationship between DNMT3A and PIK3CA in OCCC, while TP53 and FAT1 were mutually exclusive in HGSOC (Fig. 2). Mutually exclusive relationships between genes are associated with tumor type (Cisowski & Bergo, 2017). These results indicate that, although OCCC and ECCC have a similar genetic underpinning, the major mutated gene was ARID1A in OCCC, whereas the dominant mutated gene was TP53 in ECCC and HGSOC, which may be associated with organ selection (ovary or uterus) in tumorigenesis.
ARID1A mutation by itself does not lead to tumor formation, whereas coexistent ARID1A-PIK3CA mutations promote OCCC tumor formation (Chandler et al., 2015; Yamamoto et al., 2012). In our results, the VAF values of mutations in OCCC indicated a subclonal architecture, so all gene mutations were considered later events in OCCC tumorigenesis. Although the mutation frequencies of ARID1A and PIK3CA were the highest, the VAF value of the KRAS mutation was the highest, indicating that the KRAS mutation is an earlier event than the ARID1A and PIK3CA mutations in OCCC tumorigenesis. Notably, the KRAS mutation occurred only in samples with ARID1A or PIK3CA mutations, implying that it might play a crucial role as an early driver mutation in ARID1A- or PIK3CA-triggered OCCC. In HGSOC, the clonal mutated genes were TP53 (Vang et al., 2016) and RB1, whose mutations occurred at an earlier stage and initiated HGSOC tumorigenesis, whereas the VAF value of the TP53 mutation was less than 0.25 in OCCC, indicating that the TP53 mutation is a later event in OCCC progression. The VAF values of mutations in ECCC were low, as in OCCC, but the timing of the mutation events differed: mutations such as TP53, PPP2R1A, HIF1A, KRAS, SPOP, MAP3K1, and PIK3CA were almost concurrent, judging by their similar VAF values. These results suggest that, on the background of the prodromal disease endometriosis, malignant change in OCCC is caused by mutations acquired at later stages, which is similar to ECCC and not completely different from HGSOC in terms of clonal architecture.
SBS7b and SBS31, both characterized by C>T mutations, were identified in OCCC (Fig. 4A) and were shared between OCCC and ECCC (Fig. 4B). SBS7b has been associated with exposure to ultraviolet light (Alexandrov et al., 2020), and we cannot rule out that the observed SBS31 signature is partly due to platinum drug treatment of the patients. ID5, characterized by predominant single T base deletions, was particularly prominent in OCCC (Fig. 4A) and has been reported to be a clock-like signature associated with patient age (Alexandrov et al., 2020). Maru et al. (2017) reported that the C to T transition was the most frequently observed mutation in OCCC but did not identify the potential carcinogens corresponding to this alteration. SBS39, which features predominantly C>G alterations, was shared by HGSOC and ECCC; it is also highly represented in head and neck squamous cell carcinoma, breast cancer, and ovary-adenoma and has no known proposed etiology (Alexandrov et al., 2020), while the etiology of the ECCC-specific SBS23, which has a C>T pattern, is still unknown. Analysis of the private signatures of HGSOC, SBS1 and SBS38, suggested that the potential carcinogen exposures of HGSOC are associated with cell division/mitosis and with potential indirect damage from ultraviolet light. Our findings and previous reports indicate that the mutational signatures in OCCC are more similar to those in ECCC than to those in HGSOC. The discrepancy in signature distribution is considered reasonable given that OCCC and ECCC both originate from the endometrium.

CONCLUSIONS
In summary, our study is a new attempt to simultaneously compare the genetic features of OCCC with those of carcinomas of the uterus and ovary in order to further dissect the mechanisms that might contribute to OCCC tumorigenesis. Our data show that OCCC and ECCC have similar genomic features and that OCCC differs significantly from HGSOC at the genomic level. Importantly, we emphasize the role of key gene mutations in OCCC tumorigenesis and reveal the clonal architecture and novel CNV events of OCCC. Our results provide a new experimental basis for identifying therapeutic targets in OCCC.